Override missing intrinsics in gcc <11 #572

meiravgri · 2024-12-17T09:32:15Z

Several intrinsics are not implemented in gcc < 11.

The full list, with a suggested alternative for each missing function, can be found here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95483

The missing functions were added in gcc 11.

In this PR we override _mm256_loadu_epi8 with _mm256_maskz_loadu_epi8, using a ~0 (all-ones) mask so that every byte is loaded, as recommended by the gcc team.
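For reference, a minimal sketch of what such an override could look like. The macro body follows the description above. The AVX-512 feature-macro guard, the memcpy fallback, and the copy32_int8 helper are assumptions added so the snippet builds on any machine; they are not taken from the PR.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

#if defined(__AVX512BW__) && defined(__AVX512VL__)
#include <immintrin.h>
#if defined(__GNUC__) && (__GNUC__ < 11)
// gcc < 11 does not provide _mm256_loadu_epi8, see
// https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95483
// With an all-ones mask the zero-masking load reads all 32 bytes.
#define _mm256_loadu_epi8(ptr) _mm256_maskz_loadu_epi8(~0, ptr)
#endif
#endif

// Hypothetical helper, for illustration only: copies 32 int8 values through
// the (possibly overridden) intrinsic, or through memcpy when the AVX-512
// BW/VL extensions are not enabled at compile time.
static void copy32_int8(const int8_t *src, int8_t *dst) {
    assert(src != nullptr && dst != nullptr);
#if defined(__AVX512BW__) && defined(__AVX512VL__)
    __m256i v = _mm256_loadu_epi8(src);
    _mm256_storeu_si256(reinterpret_cast<__m256i *>(dst), v);
#else
    std::memcpy(dst, src, 32);
#endif
}
```

Because the macro is defined after the intrinsics header is included, call sites keep using the _mm256_loadu_epi8 name unchanged on every compiler version.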


alonre24 · 2024-12-17T09:52:54Z

src/VecSim/spaces/space_includes.h

@@ -16,6 +16,9 @@
 #if defined(__AVX512F__) || defined(__AVX__) || defined(__SSE__)
 #if defined(__GNUC__)
 #include <x86intrin.h>
+#if (__GNUC__ < 11)


Add an explanation, or a link to the issue, in a comment here.
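A side note on the guard in the hunk above, as a small sketch rather than anything from the PR: Clang also defines __GNUC__ (4 by default), so a check on __GNUC__ alone is taken by Clang builds as well. Both helper names below are hypothetical, and the stricter variant is only one possible alternative.

```cpp
// Hypothetical helper: true when a guard of the form (__GNUC__ < 11) is
// taken by the compiler building this file.
constexpr bool guard_taken() {
#if defined(__GNUC__) && (__GNUC__ < 11)
    return true;
#else
    return false;
#endif
}

// Hypothetical stricter variant: excludes Clang, which defines __GNUC__ too,
// so this branch is taken only by real gcc older than 11.
constexpr bool real_gcc_below_11() {
#if defined(__GNUC__) && !defined(__clang__) && (__GNUC__ < 11)
    return true;
#else
    return false;
#endif
}

// The stricter condition can only hold when the original one does.
static_assert(!real_gcc_below_11() || guard_taken(),
              "stricter guard implies the original guard");
```

Taking the branch on Clang should be harmless as long as the masked load is available there, since the macro only reroutes the call. The stricter form matters only if the override must be limited to gcc.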

* [MOD-8198] Introduce INT8 distance functions (#560)
  * naive implementation of L2
  * update
  * implement naive distance for int8; add cosine to spaces; fix typos in calculator
  * imp choose L2 int8 with 256bit loop; add spaces unit tests for int8 L2; add compilation flags; introduce tests/utils for general utils
  * imp space bm for int8; change INITIALIZE_BENCHMARKS_SET to INITIALIZE_BENCHMARKS_SET_L2_IP; introduce INITIALIZE_BENCHMARKS_SET_COSINE; fix typos in Choose_INT8_L2_implementation_AVX512F_BW_VL_VNNI name
  * fix INITIALIZE_BENCHMARKS_SET_L2_IP and add include to F_BW_VL_VNNI
  * rename unit/test_utuils to unit_test_utils
  * seed create vec
  * format
  * implement IP + unit test
  * ip bm
  * format
  * implement cosine in ip API; change create_int8_vec to populate_int8_vec; add compute norm
  * use mask sub instead of mask load
  * loop size = 512; minimal dim = 32
  * add int8 to bm
  * rename to simd64
  * convert to int before multiplication
  * review comments: align to vector size, including the norm in cosine dist; unit test cover small dim in cosine chooser
  * use sizeof(float) instead of 4
  * remove int conversion in test_utils::compute_norm
  * REVERT!!! malicious test to see if we get to the code
  * assert dummy
  * fix alignment test
  * remove assert
  * remove cosine alignment
  * Override missing intrinsics in gcc <11 (#572)
    * override _mm256_loadu_epi8 with _mm256_maskz_loadu_epi8 if gcc < 11 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95483
    * fix
    * disable flow temp
    * add comment
* [MOD-8200] [MOD-8202] INT8 index (#566)
  * naive implementation of L2
  * update
  * implement naive distance for int8; add cosine to spaces; fix typos in calculator
  * imp choose L2 int8 with 256bit loop; add spaces unit tests for int8 L2; add compilation flags; introduce tests/utils for general utils
  * imp space bm for int8; change INITIALIZE_BENCHMARKS_SET to INITIALIZE_BENCHMARKS_SET_L2_IP; introduce INITIALIZE_BENCHMARKS_SET_COSINE; fix typos in Choose_INT8_L2_implementation_AVX512F_BW_VL_VNNI name
  * fix INITIALIZE_BENCHMARKS_SET_L2_IP and add include to F_BW_VL_VNNI
  * rename unit/test_utuils to unit_test_utils
  * seed create vec
  * format
  * implement IP + unit test
  * ip bm
  * format
  * implement cosine in ip API; change create_int8_vec to populate_int8_vec; add compute norm
  * use mask sub instead of mask load
  * loop size = 512; minimal dim = 32
  * add int8 to bm
  * rename to simd64
  * convert to int before multiplication
  * introduce IntegralType_ComputeNorm
  * move preprocessor logic to choose if cosine preprocessor is needed to CreateIndexComponents: pass bool is_normalized; get distance function according to original metric; get pp according to is_normalized && metric == VecSimMetric_Cosine, and remove this logic from the index factories; add dataSize member to AbstractIndexInitParams; add VecSimType_INT8 type; introduce VecSimParams_GetDataSize: returns datasize; introduce and implement GetNormalizeFunc<int8_t> that returns int8_normalizeVector; int8_normalizeVector computes the norm and stores it at the end of the argument vector
  * add int8 tests
  * fix include unint_test_utils
  * add int8 to index factories; remove normalize_func from VecSimIndexAbstract members; tests: int8 unit test create int8 indexes; unit_test_utils: CalcIndexDataSize casts VecSimIndex * to VecSimIndexAbstract<dist_t, data_t> * and calls VecSimIndexAbstract<dist_t, data_t>::getDataSize(); cast_to_tiered_index<data_t, dist_t> takes VecSimIndex * and casts to TieredHNSWIndex<data_t, dist_t> *
  * add EstimateInitialSize for int8 to index factories; 2 new functions in test_utils: CreateTieredParams, CreateNewTieredHNSWIndex; add test_initial_size_estimation to CommonTypeMetricTests; use CommonTypeMetricTieredTests for tiered tests
  * add int8 unit tests; add int8 to:
    * VecSimDebug_GetElementNeighborsInHNSWGraph
    * VecSim_Normalize
    * HNSW NewIndex from file
  * remove duplicated GetDistFunc<int8_t, float>; move ASSERT_DEBUG_DEATH of CalcIndexDataSize to a separate test
  * remove assert test, the statement is executed and causes crash
  * improve normalize test
  * rename test_utils::compute_norm -> test_utils::integral_compute_norm; remove test_normalize.cpp file
  * use stack allocation instead of heap allocation in tests
  * fix float comparison in test_serialization; avoid evaluating statement in typeid to avoid clang warning
  * rename CalcIndexDataSize -> CalcVectorDataSize; move components tests from test_common to test_components
  * add comment to INSTANTIATE_TEST_SUITE_P
* [MOD-8206] INT8 flow tests (#573)
  * test_hnsw.py initial
  * int8 hnsw tests
  * general tests class
  * flow_bruteforce.py: introduce GeneralTest, call from TestINT8; common.py: introduce create_flat_index, create_add_vectors; move fp32_expand_and_calc_cosine_dist to common.py
  * tiered flow tests:
    * add optional create_data_func to IndexCtx, use for special datatypes
    * introduce test_create_int8 and test_search_insert_int8; create_int8_vectors expects shape (tuple)
  * use query.flat
  * revert using flat (not helping in int8); fix float16 calling query.flat
  * revert changes in Data class in bf tests; revert test_bf_float16_range_query change
  * fix merge

Same squashed commit message as the preceding entry, starting with "[MOD-8198] Introduce INT8 distance functions (#560)" (cherry picked from commit babfbe0)

[MOD-8198] Introduce INT8 (#560) (#571): same squashed commit message as the first entry above (cherry picked from commit babfbe0)

Co-authored-by: meiravgri <109056284+meiravgri@users.noreply.github.com>

meiravgri added 2 commits December 17, 2024 09:24

override _mm256_loadu_epi8 with _mm256_maskz_loadu_epi8 if gcc < 11

3230523

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95483

fix

b6c713a

meiravgri requested review from alonre24 and GuyAv46 December 17, 2024 09:44

disable flow temp

7597870

alonre24 approved these changes Dec 17, 2024


alonre24 reviewed Dec 17, 2024


meiravgri changed the title ~~Meiravg_fix_unsupported_intrinsinc_for_gcc_10~~ Override missing intrinsics in gcc <11 Dec 17, 2024

add comment

393c259

meiravgri merged commit e38fc0b into meiravg_feature_int_uint_8 Dec 17, 2024
10 checks passed

meiravgri deleted the meiravg_fix_unsupported_intrinsinc_for_gcc_10 branch December 17, 2024 09:57
